Abstract

The THO complex participates during eukaryotic mRNA biogenesis in coupling transcription to formation and nuclear 
export of translation-competent messenger ribonucleoprotein particles. In Saccharomyces cerevisiae, THO has been defined 
as a heteropentamer composed of the Tho2p, Hpr1p, Tex1p, Mft1p, and Thp2p subunits and the overall three-dimensional
shape of the complex has been established by negative stain electron microscopy. Here, we use small-angle X-ray scattering
measured for isolated THO components (Mft1p and Thp2p) as well as THO subcomplexes (Mft1p-Thp2p and Mft1p-Thp2p-Tho2p)
to construct structural building blocks that allow positioning of each subunit within the complex. To accomplish
this, the individual envelopes determined for Mft1p and Thp2p are first fitted inside those of the Mft1p-Thp2p and
Mft1p-Thp2p-Tho2p complexes. Next, the ternary complex structure is placed in the context of the five-component electron
microscopy structure. Our model not only reveals the position of each protein in the THO complex relative to the others,
but also shows that the pentamer is likely somewhat larger than what was observed by electron microscopy.



Citation: Poulsen JB, Sanderson LE, Agerschou ED, Dedic E, Boesen T, et al. (2014) Structural Characterization of the Saccharomyces cerevisiae THO Complex by 
Small-Angle X-Ray Scattering. PLoS ONE 9(7): e103470. doi:10.1371/journal.pone.0103470 

Editor: Petri Kursula, University of Oulu, Finland 

Received December 16, 2013; Accepted July 3, 2014; Published July 25, 2014 

Copyright: © 2014 Poulsen et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits 
unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. 

Funding: This work has been funded by the Danish National Research Foundation's Centre for mRNP Biogenesis and Metabolism and the Lundbeck Foundation. 
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. 

Competing Interests: The authors have declared that no competing interests exist. 

* Email: deb@mb.au.dk 



Introduction 

In eukaryotes, the transcription and processing of pre-mRNAs, 
formation of messenger ribonucleoprotein particles (mRNPs), and 
eventually, nuclear export, are tightly linked processes, involving a 
plethora of protein complexes operating in synergy to deliver 
translation-competent mRNPs to the cytosol. In essence, these 
processes are regulated and sustained by numerous transient 
interactions mediated by protein-protein and protein-nucleic acid 
contacts, functioning at every step along the way from the site of 
transcription to the nuclear pore complex (NPC) [1]. Disruption of 
any of these processes can potentially cause activation of the RNA 
surveillance machinery and subsequent degradation of mRNAs in 
the nucleus [2]. 

One essential component of early mRNA biogenesis is the 
evolutionarily conserved THO complex, composed in the yeast, 
Saccharomyces cerevisiae, of Tho2p (184 kDa), Hpr1p (88 kDa),
Tex1p (47 kDa), Mft1p (45 kDa), and Thp2p (33 kDa). The THO
complex accompanies RNA polymerase II during transcription by
binding directly to its poly-phosphorylated C-terminal domain
(CTD) and facilitates mRNP packaging and export by an
unknown mechanism [1,3]. However, it is known that THO
co-transcriptionally recruits and binds the mRNA export factors
Yra1p (an RNA-binding protein) and Sub2p (an RNA helicase),
into a larger assembly that has been termed the TRanscription and
EXport (TREX) complex [4]. THO also mediates interactions
with the poly(A)+ RNA-binding factor, Mex67p, essential for
mRNA export, as well as the serine/arginine-rich (SR)-like
proteins, Gbp2p and Hrb1p, which stimulate cotranscriptional
recruitment of the proteins to nascent mRNA transcripts [4-6].
Furthermore, TREX has been found to transiently interact with
the transcription elongation and splicing factor, Prp19p [7].

Depletion or knockout of individual THO complex components 
in vivo has revealed that the complex is not only involved in 
mRNA biogenesis but also takes part in preserving genome 
integrity. THO knockout phenotypes usually display decreased 
levels of nuclear mRNP production leading to stalling of 
transcription elongation, formation of RNA/DNA hybrid loops 
(R-loops), genomic instability, and eventually DNA hyper-recom- 
bination [8,9]. Deletion of THO components also triggers 
formation of large aggregates near the nuclear envelope known 
as heavy chromatin, composed of transcriptionally active chroma- 
tin, proteins of the RNA export machinery, pre-mRNA, and 
nuclear pore components [10]. Interestingly, long, GC-rich genes 
appear to be affected most dramatically in such THO mutant 
strains [11]. 

The yeast THO complex was originally characterised as a
four-subunit assembly of the proteins Tho2p, Hpr1p, Mft1p, and
Thp2p, none of which have known structural domains or functions
assigned. However, biochemical and negative-stain electron
microscopy (EM) data of complexes purified from native source
in Saccharomyces cerevisiae suggest that the WD40 repeat protein,
Tex1p (47 kDa), is stably associated as well, thus forming a
heteropentameric structure [12,13]. Analysis of the THO complex
by negative-stain EM has yielded three-dimensional
reconstructions of the complex both in the presence and absence of Tex1p,
allowing the position of this protein on the surface of the core
THO particle to be accurately determined [13]. In addition, the
C-termini of Hpr1p and Tho2p were located with the help of an
Hpr1p-specific antibody and dynein-tagging of Tho2p [14].
However, the locations of Mft1p, Thp2p, and the N-terminal
domains of Tho2p and Hpr1p within the core THO complex have
not been described so far.

In this paper, we identify stable subcomplexes of S. cerevisiae
THO (Mft1p-Thp2p and Mft1p-Thp2p-Tho2p) and use
small-angle X-ray scattering (SAXS) to determine envelopes of
individual subunits (Mft1p and Thp2p) as well as the
subcomplexes. These SAXS envelopes are then used as building blocks for
docking all four subunits within the core THO complex. Mft1p,
Thp2p, and Tho2p can be positioned with confidence inside the
envelope of the ternary Mft1p-Thp2p-Tho2p complex, which is
then used for docking into the EM model representing the entire
THO core complex [13]. The final model reveals the position of
each protein in the complex and further suggests that the overall
size of the complex might have been slightly underestimated by the
negative-stain EM procedure.

Materials and Methods 

Protein expression 

For expression screening of multiple THO complexes,
bicistronic constructs were prepared encoding Hpr1p-Mft1p and
Tho2p-Thp2p, respectively. The Hpr1p-Mft1p constructs were
inserted by ligation-independent cloning (LIC) into the pRSF-2
Ek/LIC bacterial expression vector (kanamycin, Novagen) while
the Tho2p-Thp2p constructs were inserted into the pET-52b
3C/LIC vector (ampicillin, Novagen), in both cases using the LIC Duet
Minimal Adaptor (Novagen). Tho2p and Mft1p contain a
primer-encoded C-terminal Strep II fusion tag, Thp2p an N-terminal
Strep II fusion tag, and Hpr1p a C-terminal 6xHis-tag. Binary,
truncated Mft1pΔC232-392-Thp2p and Mft1pΔC248-392-Thp2p
complexes were inserted as bicistronic constructs into the bacterial
pETM-13 expression vector (kanamycin, EMBL) using the NcoI
and BamHI restriction sites. A primer-encoded Tobacco Etch
Virus (TEV) protease-cleavable, C-terminal 8xHis fusion tag was
added to Mft1p. Furthermore, Mft1pΔC336-392 and Thp2p were
LIC-cloned for isolated protein expression into the pET-30
Ek/LIC vector (kanamycin, Novagen), adding a primer-encoded
C-terminal 6xHis fusion tag to the protein. All constructs were based
on standard Saccharomyces cerevisiae genomic DNA as template
for PCR (EMD Millipore).

The Hpr1p-Mft1p and Tho2p-Thp2p constructs were
co-transformed into E. coli Rosetta (DE3) cells and colonies selected
on LB agar containing ampicillin, kanamycin, and
chloramphenicol, while the Mft1pΔC232-392-Thp2p, Mft1pΔC248-392-Thp2p,
Mft1pΔC336-392, and Thp2p constructs were selected using
kanamycin and chloramphenicol only. Protein expression was
carried out in LB medium with induction at OD600 = 0.8 with
0.5 mM IPTG and subsequent growth for 18 h at 20°C. Cells
were harvested by centrifugation, resuspended in 20 mL of lysis
buffer (50 mM Tris-Cl pH 8.0, 300 mM KCl, 5 mM MgCl2, 10%
glycerol, and 5 mM 2-mercaptoethanol) per L of cell culture in
addition to 1 mM PMSF and 1x Complete Protease Inhibitor
tablet (Roche). Cell lysis was achieved by sonication and high
pressure homogenization in the presence of 10 µg/mL of DNase I
and 40 µg/mL of RNase A. The lysates were finally clarified for
purification by ultracentrifugation.



Protein purification 

For expression-purification screening of multiple THO
complexes, 5% Ni-NTA magnetic agarose beads (Qiagen) were added
to clarified lysates (200 µl beads per g of pellet dry weight
corresponding to ~200 mL of culture volume) and incubated with
rotation at 4°C for 1 h. Samples were washed (50 mM Tris-Cl
pH 8.0, 1 M KCl, 5 mM MgCl2, 20 mM imidazole, 10%
glycerol, and 5 mM 2-mercaptoethanol) using a magnetic
separator and elution (50 mM Tris-Cl pH 8.0, 300 mM KCl,
5 mM MgCl2, 250 mM imidazole, 10% glycerol, and 5 mM
2-mercaptoethanol) was carried out at 4°C for 1 min, after which
the eluates were collected as supernatants using a magnetic
separator. Elution fractions were combined with 10%
Strep-Tactin magnetic beads (200 µl per g of pellet dry weight) (Qiagen)
and incubated with rotation at 4°C for 1 h. Proteins were washed
again (50 mM Tris-Cl pH 8.0, 300 mM KCl, 5 mM MgCl2, 10%
glycerol, and 5 mM 2-mercaptoethanol) and eluted (50 mM
Tris-Cl pH 8.0, 300 mM KCl, 5 mM MgCl2, 10% glycerol, 10 mM
biotin, and 5 mM 2-mercaptoethanol) using the magnetic
separator. For purification of the Mft1pΔC336-392-Thp2p,
Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597, and
Mft1pΔC270-392-Thp2p complexes, clarified lysates were loaded
onto a Strep-Tactin Superflow Plus Cartridge column (Qiagen)
pre-equilibrated in lysis buffer. Unbound protein was removed
with lysis buffer, and bound proteins eluted (50 mM Tris-Cl
pH 8.0, 300 mM KCl, 5 mM MgCl2, 2.5 mM d-desthiobiotin,
10% glycerol, and 5 mM 2-mercaptoethanol). Elution fractions
were diluted four-fold to 50 mM Tris-Cl pH 8.0, 75 mM KCl,
5 mM MgCl2, 2.5 mM d-desthiobiotin, 2.5% glycerol, and 5 mM
2-mercaptoethanol and applied to a Source 15Q column
pre-equilibrated in 50 mM Tris-Cl pH 8.0, 100 mM KCl, 5 mM
MgCl2, and 5 mM 2-mercaptoethanol. Elution (50 mM Tris-Cl
pH 8.0, 1 M KCl, 5 mM MgCl2, and 5 mM 2-mercaptoethanol)
was carried out with a gradient extending from 0-100% over
30 mL. Fractions containing the purified complexes were
identified by Coomassie-stained SDS-PAGE gels, then pooled and
concentrated using a Vivaspin concentrator (Sartorius). Protein
complexes were finally purified using size-exclusion
chromatography running in 50 mM Tris-Cl pH 8.0, 100 mM KCl, 5 mM
MgCl2, and 5 mM 2-mercaptoethanol using a Superdex 200 GL
10/300 column (GE Healthcare). Subsequent rounds of
size-exclusion chromatography (SEC) were carried out to separate
ternary Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597 from
binary Mft1pΔC336-392-Thp2p formed following expression of the
complex THO3. A total of ~2 mg (Mft1pΔC336-392-Thp2p),
~0.1 mg (Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597) and
~3 mg (Mft1pΔC270-392-Thp2p) of purified complexes could
be obtained per L of cell culture. To purify
Mft1pΔC232-392-Thp2p and Mft1pΔC248-392-Thp2p, the clarified lysates were
loaded onto a Ni-NTA Superflow Cartridge column (Qiagen)
equilibrated in lysis buffer. Unbound proteins were washed out
(50 mM Tris-Cl pH 8.0, 300 mM KCl, 5 mM MgCl2, 20 mM
imidazole, 10% glycerol, and 5 mM 2-mercaptoethanol), and the
bound fractions eluted (50 mM Tris-Cl pH 8.0, 300 mM KCl,
5 mM MgCl2, 250 mM imidazole, 10% glycerol, and 5 mM
2-mercaptoethanol). Elution fractions were treated with His-tagged
recombinant TEV protease using a ratio of 1:100 (protease:total
protein by mass) and dialyzed overnight against lysis buffer.
TEV-cleaved samples were re-applied to the Ni-NTA cartridge and the
flow-through collected. Flow-through fractions containing target
Mft1p-Thp2p complexes were then diluted four-fold to 50 mM
Tris-Cl pH 8.0, 75 mM KCl, 5 mM MgCl2, 2.5% glycerol, and
5 mM 2-mercaptoethanol and further purified using the Source
15Q and Superdex 200 GL 10/300 columns (GE Healthcare) as
described above. ~2 mg (Mft1pΔC232-392-Thp2p) and ~3 mg
(Mft1pΔC248-392-Thp2p) of purified complexes were obtained
per L of cell culture. Isolated Mft1pΔC336-392 and Thp2p proteins
were purified as for Mft1pΔC232-392-Thp2p and
Mft1pΔC248-392-Thp2p, respectively, however without TEV cleavage and the 2nd
Ni-NTA step. ~1 mg of each protein was obtained per L of cell
culture.
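
The four-fold dilution steps described above lower the KCl and glycerol concentrations while leaving Tris-Cl, MgCl2, and 2-mercaptoethanol unchanged, which implies that the diluent itself contained the latter three components. The following is a minimal sketch of that arithmetic for the Ni-NTA route; the diluent composition is an assumption inferred from the stated start and end concentrations, and the function name `mix` is purely illustrative.

```python
def mix(sample, diluent, fold):
    """Concentrations after mixing 1 volume of sample with (fold - 1) volumes of diluent."""
    return {
        name: (conc + (fold - 1) * diluent.get(name, 0.0)) / fold
        for name, conc in sample.items()
    }

# Lysis buffer as stated in the text (sample after dialysis against lysis buffer).
lysis_buffer = {
    "Tris-Cl (mM)": 50.0,
    "KCl (mM)": 300.0,
    "MgCl2 (mM)": 5.0,
    "glycerol (%)": 10.0,
    "2-mercaptoethanol (mM)": 5.0,
}

# ASSUMPTION: diluent lacking KCl and glycerol; not specified in the text.
diluent = {
    "Tris-Cl (mM)": 50.0,
    "MgCl2 (mM)": 5.0,
    "2-mercaptoethanol (mM)": 5.0,
}

diluted = mix(lysis_buffer, diluent, fold=4)
for name, conc in diluted.items():
    print(f"{name}: {conc:g}")
```

Under this assumption the result reproduces the stated loading conditions (75 mM KCl and 2.5% glycerol with the other components unchanged).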

Biochemistry 

Purified proteins/complexes were concentrated for biochemical
analysis using Vivaspin concentrators and characterized in regard
to size and molecular weight using a Zetasizer µV instrument
(Malvern) connected inline to the Superdex 200 GL 10/300
column for both static (SLS) and dynamic (DLS) light scattering
measurements. Proteins were identified by MALDI-TOF-MS of
peptides derived by in-gel trypsin (Sigma) digestion of excised gel
spots. In situ proteolysis was carried out using serially diluted trypsin
for 16 h at 4°C, in which the degradation fragments were N- and
C-terminally characterized by Edman degradation and
MALDI-TOF-MS analyses, respectively. Protein complexes were designed
and sub-cloned based on this information. Comparison of relative
protein amounts by densitometric analyses was carried out using
the ImageQuant TL software (GE Healthcare).

Small-angle X-ray scattering 

Synchrotron radiation SAXS data were collected at the EMBL
SAXS-WAXS beamline X33 at DORIS/DESY (Hamburg,
Germany), the MAX-lab I911-4 SAXS beamline and the P12
beamline EMBL SAXS-WAXS at PETRA III/DESY. The X33
data were collected as 8 x 15 sec exposures at 1.5 Å using a
MAR345 image plate detector, and scattering profiles for the 8
passes were compared to detect radiation damage. The I911-4
data were collected as 2 x 30 sec exposures at 1.5 Å using a Pilatus
detector, while the P12 data were collected as 20 x 0.05 sec
exposures and scattering profiles for the 20 passes were compared
to detect radiation damage. Measurements were conducted at
10°C for X33 and P12 and 25°C for I911-4 using 10-75 µL
sample in 50 mM Tris-Cl pH 8.0, 100 mM KCl, 5 mM MgCl2,
and 5 mM 2-mercaptoethanol. Measurements were carried out at
3-6 different concentrations in all cases using concentrations
between 7.8 mg/ml and 0.15 mg/ml. Background scattering was
subtracted using ATSAS [15] and PRIMUS [16]. Linear Guinier
plots in the Guinier region (s·Rg < 1.3) were confirmed in all cases
(Figure S4). Pair distance distribution functions of the particles P(r)
and the maximum sizes Dmax were computed using GNOM [17]
and molecular weights were estimated by comparison of the
extrapolated forward scattering I(0) of the samples obtained using
Guinier analysis by AUTORG [18] with that of a bovine serum
albumin standard (Sigma-Aldrich) (Table S1). Porod volumes were
calculated using ATSAS AUTOPOROD [18] and ab initio shapes
were determined using DAMMIF [19], in the case of the Thp2p
and Mft1p homodimers, both with and without a P2 symmetry
constraint (Figure S6). After 12 DAMMIF runs, DAMAVER [20]
was used to analyse the normalized spatial discrepancy (NSD)
between the 12 models and the lowest NSD model was used as
representative, except for the heterotrimeric complex where the
filtered model was used. Envelopes were calculated using SITUS
pdb2vol based on the DAMMIF models [21]. The model
envelopes were docked manually in UCSF Chimera and map
correlation coefficients were determined [22]. Enantiomer versions
of all envelopes were tested for optimal fit. For the Mft1p and
Thp2p homodimers, GASBOR modeling was also carried out
using P2 symmetry for comparison (Figure S6) [23]. Multiphase ab
initio modeling using both the Mft1pΔC336-392-Thp2p
heterodimer and Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597 trimer data
was carried out using the online version of MONSA
(http://www.embl-hamburg.de/biosaxs/atsas-online/) [24]. The measured Rg
value of 4.6 nm for the heterodimer (Table S1), a spherical search
volume with a radius of 11.5 nm, and P1 symmetry were imposed
during this refinement, in which a total of 12 runs were compared.
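
The Guinier analysis and the I(0)-based molecular weight estimate described above can be illustrated with a short sketch. This is not the AUTORG algorithm; it is a plain least-squares fit of ln I(s) = ln I(0) - (Rg²/3)·s² restricted iteratively to s·Rg < 1.3, applied here to synthetic, noise-free data. The function names, the synthetic curve, and the value of 66 kDa used for the bovine serum albumin standard are assumptions made for illustration only.

```python
import numpy as np

def guinier_fit(s, intensity, max_srg=1.3, max_iter=50):
    """Fit ln I(s) = ln I(0) - (Rg**2 / 3) * s**2 on the points with s*Rg < max_srg.

    s must be sorted in ascending order. Returns (Rg, I0).
    """
    s = np.asarray(s, dtype=float)
    log_i = np.log(np.asarray(intensity, dtype=float))
    n_use = len(s)  # start with every point, then shrink to the Guinier region
    for _ in range(max_iter):
        slope, intercept = np.polyfit(s[:n_use] ** 2, log_i[:n_use], 1)
        if slope >= 0:
            raise ValueError("no Guinier region: ln I does not decrease with s^2")
        rg = np.sqrt(-3.0 * slope)
        n_new = max(int(np.sum(s * rg < max_srg)), 3)
        if n_new == n_use:
            break
        n_use = n_new
    return float(rg), float(np.exp(intercept))

def mw_from_i0(i0, conc, i0_std, conc_std, mw_std=66.0):
    """Molecular weight (kDa) from concentration-normalised forward scattering.

    ASSUMPTION: mw_std = 66 kDa for the BSA standard; the text gives no value.
    """
    return mw_std * (i0 / conc) / (i0_std / conc_std)

# Synthetic Guinier-law curve with Rg = 4.6 nm (s in nm^-1), for illustration only.
s = np.linspace(0.05, 0.5, 50)
intensity = 100.0 * np.exp(-(s * 4.6) ** 2 / 3.0)
rg_fit, i0_fit = guinier_fit(s, intensity)
print(f"Rg = {rg_fit:.2f} nm, I(0) = {i0_fit:.1f}")

# Hypothetical I(0) values and concentrations for a sample and the standard.
mw_example = mw_from_i0(i0=2.2, conc=2.0, i0_std=1.0, conc_std=1.0)
print(f"MW estimate = {mw_example:.1f} kDa")
```

With real data, the fit range, the concentration series, and inter-particle effects all influence the result, so the values from dedicated software should be preferred.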

Results 

Isolation of binary and ternary subcomplexes of THO 

Initially, a His-tagged version of the intact, tetrameric
Saccharomyces cerevisiae THO complex consisting of the proteins Tho2p,
Hpr1p, Mft1p, and Thp2p was expressed and purified from E. coli
Rosetta (DE3) using two compatible plasmids each encoding two
full-length proteins (Hpr1p-Mft1p in pRSF-2 Ek/LIC and
Tho2p-Thp2p in pET-52b 3C/LIC). However, using this setup, expression
yields were quite low and Tho2p and Hpr1p showed signs of
significant degradation. To stabilize the proteins during
expression, we introduced serial N- and C-terminal truncations in Tho2p
and Hpr1p. Since no detailed structural information is available
for any of the THO proteins, the truncations were based on
sequence analysis and avoiding interruption of predicted
secondary structure elements. Two variants of Mft1p were also generated
with C-terminal deletions of 5 and 13 kDa, respectively. In total,
the truncations yielded eight plasmids encoding bicistronic
Hpr1p-Mft1p constructs and four plasmids encoding bicistronic
Tho2p-Thp2p constructs (Figure 1A and S1A). Tho2p, Mft1p, and
Thp2p were expressed as Strep II Tag fusion proteins while Hpr1p
contained a 6xHis-Tag fusion, allowing for two-step affinity
purification. The plasmids were combined by co-transformation to
yield 32 unique combinations of the THO complex that were
tested for expression level and stability. A complete overview of the
constructs and combinations prepared can be found in Figure S1.
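
The count of 32 follows from pairing each of the eight Hpr1p-Mft1p plasmids with each of the four Tho2p-Thp2p plasmids. A minimal sketch of this enumeration is given below; the plasmid labels are hypothetical placeholders and do not correspond to the construct names in Figure S1.

```python
from itertools import product

# Hypothetical labels for the eight and four bicistronic plasmids, respectively.
hpr1_mft1_plasmids = [f"Hpr1p-Mft1p_variant{i}" for i in range(1, 9)]
tho2_thp2_plasmids = [f"Tho2p-Thp2p_variant{i}" for i in range(1, 5)]

# Each co-transformation combines one plasmid from each set.
combinations = list(product(hpr1_mft1_plasmids, tho2_thp2_plasmids))
print(len(combinations))  # 8 x 4 = 32
```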

All THO complex variants were screened and ranked in terms
of expression yield and stability of the individual proteins in order
to identify those most suitable for structural studies. Briefly,
complexes were expressed in E. coli Rosetta (DE3) cells, purified
by a two-step affinity purification protocol involving Ni-NTA
magnetic agarose beads followed by Strep-Tactin magnetic beads,
and finally, eluted and analysed by Coomassie Blue-stained
SDS-PAGE. Figure 1B (left-hand side) shows the results for a subset of
the complexes, THO1, THO3, and THO4 (see figure legend for
details). Stable forms of Mft1p and Thp2p were observed in nearly
all cases, whereas neither Tho2p nor Hpr1p could be identified in
these experiments. Mft1p and Thp2p were present in near equal
stoichiometric amounts as determined by densitometric analysis of
the gels (data not shown) suggesting formation of a two-component
subcomplex of THO consisting of the two proteins. This
hypothesis was further supported by the detection of Mft1p and
Thp2p in the Strep-Tactin eluate fraction of THO3 versus their
absence for complex THO2 for which Mft1p lacks the Strep II tag
(Figure S1B). In essence, the analysis showed that only the Strep II
Tag fusion of Mft1p is functional and thus that Thp2p appears in
the Strep-Tactin elution fraction due to interaction with Mft1p,
not due to its own Strep II tag fusion, which in this case seems to
be occluded. Although Mft1p and Thp2p are the only proteins
purified in these experiments, Hpr1p and Tho2p could still be
expressed and present substoichiometrically. To examine this
possibility, the purification was repeated using large-scale
expression cultures (Figure 1B, right). Interestingly, this yielded a band
for THO3 migrating at the expected size for Tho2p, which was
confirmed by mass spectrometry. This band corresponds to a
C-terminally truncated version of the Tho2p protein,
Tho2pΔC1274-1597 (Figure 1B and S1C). Approximately 2 mg of the
Mft1pΔC336-392-Thp2p and ~0.1 mg of the
Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597 complex could be purified per L of expression
cell culture using strain THO3. In conclusion, we thus find that
binary complexes consisting of Mft1p and Thp2p most likely result
following expression of THO3 and THO4, i.e.
Mft1pΔC336-392-Thp2p and Mft1pΔC270-392-Thp2p, respectively. Furthermore,
the expression pattern observed for complex THO3 indicates that
a subcomplex of Mft1p, Thp2p, and Tho2p could have been
formed as well.

Figure 1. Purification of THO subcomplexes. A. Overview of the Tho2p, Mft1p and Thp2p constructs used relative to their full-length forms
(boxed in grey). Blue boxes: Stretches of residues removed to obtain the construct directly below. Purple boxes: Strep II tag. B. THO complexes
(THO1, 3, 4, and 5) were expressed and purified from E. coli in two steps and analysed by Coomassie-stained SDS-PAGE. The table shows for each
construct the molecular weight of each protein and any associated tags (H = 6xHis, S = Strep II) as well as whether the protein is expressed (+/-).
Positions on the gel for proteins confirmed by mass spectrometry are indicated with arrowheads and * indicates an E. coli protein contaminant. C.
Purified THO subcomplexes analysed by Coomassie-stained SDS-PAGE: Heterodimeric THO4 (Mft1pΔC270-392-Thp2p), heterodimeric THO3
(Mft1pΔC336-392-Thp2p), and heterotrimeric THO3 (Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597). D. Overlay of gel filtration chromatograms obtained
during isolation of the ternary Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597 (Trimer) from the binary Mft1pΔC336-392-Thp2p (Dimer). Between runs 1 (blue),
2 (red), and 3 (green), peak fractions were pooled, concentrated and re-applied to the column. Elution retention volumes are noted along with the
positions of standards used for calibration: Blue Dextran (V0, void volume, 2000 kDa), ferritin (440 kDa), aldolase (158 kDa), and conalbumin (75 kDa).
Units on the y-axis are mAU absorption at 280 nm.
doi:10.1371/journal.pone.0103470.g001

Characterisation of binary and ternary subcomplexes of THO

To identify and characterize the observed subcomplexes of
THO, the strains expressing THO3 and THO4 were grown in
larger cultures and purified using a three-step procedure involving
Strep-Tactin chromatography, anion-exchange chromatography,
and gel filtration chromatography. During these experiments, we
were able to isolate both binary (Mft1pΔC336-392-Thp2p) and
ternary (Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597) THO
subcomplexes from THO3, whereas a binary subcomplex
(Mft1pΔC270-392-Thp2p), containing a more truncated copy of
Mft1p, could be isolated following expression of THO4
(Figure 1C). Using successive rounds of gel filtration chromatography
it was further possible to separate the ternary and binary forms
expressed in THO3 (Figure 1D). The purified complexes were
analysed by densitometric analysis of the SDS-PAGE gels, which
suggested that they contain the proteins in equal stoichiometric
amounts as expected (data not shown). Furthermore, molecular
masses of 60-70 kDa (Mft1pΔC270-392-Thp2p), 70-80 kDa
(Mft1pΔC336-392-Thp2p), and 220-230 kDa
(Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597) were estimated for the subcomplexes
using static light scattering (Figure S2), which is in good agreement
with the calculated, theoretical molecular masses of 65, 73, and
220 kDa, respectively.

Structural analysis of THO subcomplexes 

To identify further stable, truncated complexes of binary
Mft1pΔC270-392-Thp2p and Mft1pΔC336-392-Thp2p we used
limited proteolysis followed by sub-cloning and expression. An
overview of the proteolysis experiments leading to isolation of the
truncated subcomplexes Mft1pΔC232-392-Thp2p and
Mft1pΔC248-392-Thp2p as well as the purification of these subcomplexes is
shown in Figure S3. We next analysed the Mft1pΔC336-392-Thp2p
binary complex and the more truncated Mft1pΔC232-392-Thp2p
form identified by limited proteolysis as well as the
Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597 ternary complex by small-angle
X-ray scattering (SAXS). In addition, isolated Mft1pΔC336-392 and
full-length Thp2p were purified and analysed (Figure S3D). SAXS
data were collected at beamlines DORIS X33 (DESY, Hamburg),
PETRA P12 (DESY, Hamburg) and I911-4 (MAX-lab, Sweden) and
data quality evaluated using Guinier plots (Figure S4).
Dimensionless Kratky plots displayed the characteristic peak expected for
folded proteins for all samples (Figure S5). The SAXS data were
first used to calculate the maximum particle dimension (Dmax) and
radius of gyration (Rg) for each sample, which showed highly
similar values at all evaluated concentrations for each of the
complexes (Table S1). For isolated Mft1pΔC336-392 and Thp2p,
inter-particle effects were observed at high concentrations and
measurements at low concentrations suggested that they both form
stable homodimers in solution (Table S1).
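
For reference, the dimensionless Kratky representation plots (s·Rg)²·I(s)/I(0) against s·Rg. For a compact particle that follows the Guinier approximation, this curve has a maximum at s·Rg = √3 with a height of 3/e ≈ 1.10, which is the benchmark against which folded proteins are usually compared. The sketch below demonstrates this on a synthetic Guinier-law curve; the function name and the synthetic data are illustrative assumptions and are unrelated to the measured profiles in Figure S5.

```python
import numpy as np

def dimensionless_kratky(s, intensity, rg, i0):
    """Return (s*Rg, (s*Rg)**2 * I(s)/I(0)) for a scattering curve."""
    x = np.asarray(s, dtype=float) * rg
    y = x ** 2 * np.asarray(intensity, dtype=float) / i0
    return x, y

# Synthetic compact particle obeying the Guinier law (illustrative values only).
rg, i0 = 4.6, 1.0                      # nm, arbitrary units
s = np.linspace(0.001, 1.0, 2000)      # nm^-1
intensity = i0 * np.exp(-(s * rg) ** 2 / 3.0)

x, y = dimensionless_kratky(s, intensity, rg, i0)
peak_position = float(x[np.argmax(y)])
peak_height = float(y.max())
print(f"peak at s*Rg = {peak_position:.2f}, height = {peak_height:.2f}")
```

Flexible or unfolded samples typically show a peak shifted to larger s·Rg and a curve that fails to return towards the baseline, so deviation from this reference is what the plot is inspected for.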

The experimental SAXS data were next used for the
reconstruction of twelve individual ab initio molecular envelopes
using dummy bead modelling in the program DAMMIF. For each
protein or complex, the most representative model was picked as
the one having the lowest Normalised Spatial Discrepancy (NSD)
compared to the rest of the models (Figure 2). Envelopes
calculated without symmetry constraints for both
Mft1pΔC336-392 and Thp2p were elongated and exhibited quasi two-fold
symmetry around their short axes, suggesting each half
corresponded to a monomer of the homodimeric molecules (Figure S6).
To generate the envelope corresponding to the isolated
monomeric molecules, we applied P2 symmetry constraints in
DAMMIF and isolated the half-volumes corresponding to the
monomers. These monomer volumes were output in the resulting
DAMMIF representative model PDB files as the symmetry-related
bead volumes and assessed for a compact globular fold as a basic
assumption for correct monomer assignment (Figure S6). We also
carried out modelling of the Mft1p and Thp2p homodimers using
the program GASBOR, which consistently showed overall very
similar structures (Figure S6). Both the Dmax values and the
derived envelopes demonstrate that Mft1p is more elongated than
Thp2p, which on the other hand seems to adopt a more globular
form (Figure 2A and B). Compared to the symmetrical
homodimers, envelopes calculated from the binary
Mft1pΔC232-392-Thp2p and Mft1pΔC336-392-Thp2p complexes consistently
showed asymmetric envelopes consisting of a bulky and a thin
end (Figures 2C and D). Furthermore, the model calculated for
the more truncated version of the binary complex,
Mft1pΔC232-392-Thp2p, showed a smaller overall Porod volume (140 nm³) than
the more intact complex, Mft1pΔC336-392-Thp2p (194 nm³).
Finally, the ternary Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597
complex exhibited a substantially larger volume, triangular in
shape with one flat and broad surface at the base and a narrow
protrusion at the top (Figure 2E). For all complexes, the
dimensions of the averaged envelopes were consistent with the
experimentally determined Rg and Dmax values even though the
DAMMIF modelling allows expansion of the search volume in the
simulated annealing step (Table S1).

Placement of individual subunits in the THO complex 

Next, we manually docked the SAXS envelopes of Mft1p and
Thp2p into the envelopes obtained for the larger binary and
ternary complexes to determine their relative position (Figure 3).
Initially, docking of the envelope of the smaller binary
Mft1pΔC232-392-Thp2p complex into the larger
Mft1pΔC336-392-Thp2p model resulted in a very good fit (map correlation of 0.933)
with additional density in the thin end of the model to account for
the shorter C-terminus of Mft1p in the smaller complex
(Figure 3A). In fact, the difference envelope volume closely
correlates with the expected size difference of ~15 kDa compared
to the larger Mft1pΔC336-392-Thp2p envelope, suggesting it
represents the C-terminus of Mft1p (red arrow in Figure 3A).
Next, we docked the corresponding monomer envelopes of
isolated Mft1p and Thp2p into the two binary complex envelopes
to assess the position of each subunit (map correlation 0.903 and
0.931, respectively, Figures 3B and C). In this docking, we placed
the thinner end of Mft1p into the thin end of the binary complex
envelope, consistent with the shape of the isolated Mft1p protein as
well as the truncation results described above. In addition, we
assume that Thp2p and Mft1p dimerise in isolation due to the lack
of a binding partner, thus, we have oriented the two proteins in the
heterodimer in such a way that they interact using their
dimerisation interfaces.

Figure 2. Envelopes of THO proteins and complexes obtained by SAXS. Molecular envelopes obtained using DAMMIF for THO proteins and
subcomplexes. A. Mft1pΔC336-392 (as isolated from the homodimer). B. Thp2p (as isolated from the homodimer). C. Mft1pΔC232-392-Thp2p. D.
Mft1pΔC336-392-Thp2p. E. Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597. The proteins and complexes are each represented with an envelope shown in
three perpendicular views as indicated. Scale bars represent 100 Å.
doi:10.1371/journal.pone.0103470.g002

In the published model of the THO complex based on negative
stain EM, the C-terminal region of the large Tho2p protein was
located at a narrow protrusion extending away from the core
complex [13]. Our SAXS model of the ternary
Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597 complex also contains such a
protrusion, suggesting that this may represent the C-terminal region of
Tho2p as well (red arrow in Figure 3D). In addition, the envelopes
calculated for the binary THO complexes fit nicely along the
"floor" of the remaining ternary envelope (map correlation of
0.909), essentially allowing us to place all subunits in the trimer
with good confidence (Figure 3D). Figure 4A summarises the
proposed position and orientation of Mft1p, Thp2p and Tho2p
inside the SAXS envelope of the trimer as supported by our data,
with the orientation of Mft1p being supported by our truncation
data. This model suggests that Tho2p could interact with
Mft1p, Thp2p, or both. It also suggests that the interactions are likely
facilitated by elements in the N-terminal region of Tho2p as the
C-terminal region is located in the protrusion. Likewise, Mft1p is
predicted to interact with Thp2p through its N-terminal region
based on our truncation experiments.
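
The map correlation values quoted for the docking steps are, in essence, normalised cross-correlations between two density grids sampled on the same lattice. The sketch below shows the basic definition, with an optional variant in which the mean is subtracted first. It is not a reimplementation of the UCSF Chimera fitting tool, which applies its own choices of grid points and interpolation; the synthetic Gaussian "maps" and the function names are assumptions for illustration.

```python
import numpy as np

def map_correlation(map_a, map_b, about_mean=False):
    """Normalised cross-correlation of two density maps on the same grid."""
    a = np.asarray(map_a, dtype=float).ravel()
    b = np.asarray(map_b, dtype=float).ravel()
    if about_mean:
        a = a - a.mean()
        b = b - b.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def gaussian_blob(shape, center, sigma):
    """Synthetic density: an isotropic Gaussian centred at `center` (grid units)."""
    grid = np.indices(shape).astype(float)
    r2 = sum((g - c) ** 2 for g, c in zip(grid, center))
    return np.exp(-r2 / (2.0 * sigma ** 2))

# Two hypothetical envelopes: identical shape, one displaced by 3 grid points.
reference = gaussian_blob((24, 24, 24), (12, 12, 12), sigma=4.0)
displaced = gaussian_blob((24, 24, 24), (15, 12, 12), sigma=4.0)

cc_self = map_correlation(reference, reference)
cc_displaced = map_correlation(reference, displaced)
print(f"self: {cc_self:.3f}, displaced: {cc_displaced:.3f}")
```

A value of 1 indicates identical maps up to a scale factor, and the correlation drops as the docked envelope is displaced from its optimal position.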

Figure 3. Docking of THO substructures. A. Docking of the Mft1pΔC232-392-Thp2p heterodimer (yellow) into the larger Mft1pΔC336-392-Thp2p
heterodimer (blue). The arrows indicate the proposed position of the Mft1p C-terminus. B. Docking of isolated Mft1pΔC336-392 (green) and Thp2p
(red) into the larger, heterodimeric Mft1pΔC336-392-Thp2p envelope (blue). C. Docking of isolated Mft1pΔC336-392 (green) and Thp2p (red) into the
smaller, heterodimeric Mft1pΔC232-392-Thp2p envelope (yellow). D. Docking of the larger, dimeric Mft1pΔC336-392-Thp2p envelope (blue) into the
ternary Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597 envelope (purple). The arrows indicate the proposed position of the Tho2p C-terminus. Scale bars
represent 100 Å.

doi:10.1371/journal.pone.0103470.g003

To corroborate this model, ab initio multi-phase models were
prepared based on the Mft1pΔC336-392-Thp2p (dimer) and
Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597 (trimer) data using
MONSA [24]. In this procedure, models of the heterodimer, the
trimer, and the isolated Tho2pΔC1274-1597 subunit, as well as their
relative orientations, are produced by including both data sets in a
single analysis. The heterodimer models derived from MONSA
displayed a low mean NSD value of 0.682, and the resulting
models overlap very well with those derived from DAMMIF
(Figure S7A). The mean NSD value for the trimer models
derived from MONSA is 0.907, which is relatively high but again
similar to the trimer model produced by DAMMIF (1.028, Figure
S7B). Consequently, the model representing the part of the
molecule attributed to Tho2pΔC1274-1597 also displayed a high
mean NSD value of 0.857 and thus showed more variability
compared to the Mft1p-Thp2p heterodimer. However, in both cases,
the models had similar dimensions and shapes to those derived
from DAMMIF.
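
As an illustration of what the NSD values quoted above measure, the following minimal Python sketch computes a normalised spatial discrepancy between two bead models. It follows the commonly cited definition (mean squared nearest-point distances in both directions, each normalised by the bead spacing of the other model) and is not the code used in this study. The function names `mean_neighbour_spacing` and `nsd` are hypothetical, the bead spacing is approximated by the mean nearest-neighbour distance, and the models are assumed to be already superimposed, whereas programs such as SUPCOMB and DAMAVER also optimise the superposition.

```python
import numpy as np


def mean_neighbour_spacing(points):
    """Mean distance from each bead to its nearest neighbour in the same model."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # ignore the zero distance of a bead to itself
    return d.min(axis=1).mean()


def nsd(model_a, model_b):
    """NSD of two pre-aligned (N, 3) bead models; no superposition search is done."""
    a = np.asarray(model_a, dtype=float)
    b = np.asarray(model_b, dtype=float)
    cross = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    a_to_b = cross.min(axis=1) ** 2  # squared distance of each bead in a to model b
    b_to_a = cross.min(axis=0) ** 2  # squared distance of each bead in b to model a
    d_a = mean_neighbour_spacing(a)
    d_b = mean_neighbour_spacing(b)
    return float(np.sqrt(0.5 * (a_to_b.mean() / d_b**2 + b_to_a.mean() / d_a**2)))


# Toy example: a 3 x 3 x 3 bead grid compared with a copy shifted by half a spacing.
axis = np.arange(3.0)
grid = np.array([[x, y, z] for x in axis for y in axis for z in axis])
shifted = grid + np.array([0.5, 0.0, 0.0])
print(nsd(grid, grid), nsd(grid, shifted))
```

Identical models give a value of zero, and values around or below 1 are conventionally read as indicating similar shapes, which is how the mean values above are interpreted.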

Finally, we manually docked the SAXS model of the ternary
THO complex into the structure of the five-component THO
complex determined by negative stain EM, taking into account
the already known positions of Tex1p, Hpr1p, and the Tho2p
C-terminus (Figure 4B) [13]. Initial comparison of the envelope
volumes suggested that our SAXS model has a greater volume for
Mft1p, Thp2p and Tho2p than the EM reconstruction (data not
shown). Whereas a good fit could be obtained for both Mft1p and
Thp2p by positioning them at the base of the croissant-like
structure extending towards Hpr1p in the EM model, Tho2p was
found to occupy a significantly larger volume in the SAXS
envelope than in the EM reconstruction. This may, in part, be due
to flexibility, as the three top-ranked dummy-atom models,
selected by Normalised Spatial Discrepancy values from the
pairwise comparison of 11 calculated models, show some variation
in this region (Figure S8). However, the fit of the Thp2p-Mft1p
part of the SAXS envelope is good and, furthermore, the region
corresponding to Hpr1p in the EM reconstruction is not covered
by the SAXS envelope, consistent with the absence of this
protein from our samples. In summary, by combining our SAXS
data with the available EM reconstruction, we have been able to
produce a model that describes the position of all subunits of the
pentameric THO complex (Figure 5). In addition, we have
provided structural evidence that the C-terminal half of Mft1p is
located in proximity to Hpr1p, while the N-terminal half is nearer
Thp2p. This architectural model for the pentameric THO
complex should provide a solid basis for the design of functional
and genetic studies to elucidate the function of the individual
subunits during mRNP biogenesis.

Discussion 

The THO complex is a central nuclear component involved in
mRNP formation, linking transcription with export of
translation-competent mRNAs by an unknown mechanism. The
yeast THO complex is a heteropentameric protein assembly
consisting of Tho2p, Hpr1p, Tex1p, Mft1p, and Thp2p, capable
of interacting with a plethora of proteins, among others Yra1 and
Sub2, with which it forms the TREX complex [4,5,13]. Recently,
part of the architecture of the Saccharomyces cerevisiae THO
complex was established by negative stain EM that was used to
generate a three-dimensional reconstruction of the molecule [13].
The model was described as a croissant-like structure with a flat
surface at the base and two large protrusions, one long and thin
and the other shorter but wider. The EM model allowed placement
of Tex1p and Hpr1p, as well as the Tho2p C-terminus, using a
range of techniques including mutational studies,
immunochemistry, and biochemistry. Despite this effort, the
positions of Mft1p, Thp2p, and the Tho2p N-terminus within the
complex remained unclear.

Figure 4. Placement of subunits in the THO complex. A. Model showing the proposed location and orientation of the subunits Mft1p (green),
Thp2p (red), and Tho2p (blue) within the ternary Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597 THO complex (purple). The proposed locations of the N-
and C-terminal regions of Tho2p and Mft1p are indicated. Scale bars represent 100 Å. B. Comparison of the ternary Mft1pΔC336-392-Thp2p-
Tho2pΔC1274-1597 SAXS envelope with the five-component EM reconstruction of the THO complex [13]. The positions of the proteins not part of
our sample (Hpr1p and Tex1p) are indicated.
doi:10.1371/journal.pone.0103470.g004

In this study, we have used small-angle X-ray scattering to
determine envelope structures for individual THO components
and subcomplexes, which allow us to uniquely position each
subunit in the THO complex. Internally, all SAXS envelopes are
consistent and can be docked into the larger subcomplexes with
relative ease. A good fit was obtained for both Mft1p and Thp2p
along the base of the croissant-like shape determined by EM;
however, Tho2p was found to occupy a larger volume than
expected from the EM reconstruction. We believe there may be
several reasons for this discrepancy. Firstly, this region of the THO
complex may be flexible, causing it to appear larger in the SAXS
envelope and smaller in the EM model due to the averaging of
particles. Supporting this interpretation is the fact that Pena et al.
determined three-dimensional EM reconstructions of a series of
truncated THO complexes with varying lengths of the Tho2p
C-terminus without observing any differences between the
reconstructions [13]. This is despite the deletions comprising a
considerable part of the protein, with the largest corresponding
to removal of residues 1274-1597 of Tho2p, or more than 35 kDa.
Consequently, the outermost C-terminal parts of Tho2p (~320
residues) are unlikely to be visible in the published EM
reconstruction. This might be due either to flexibility of this
region, which would consequently assume different structures in
the particles used for averaging during EM reconstruction, or to
the negative staining process.

Figure 5. Overview of the composition of the THO complex. Model showing the proposed architectural arrangement of subunits within the
intact, pentameric THO complex based on the combined EM and SAXS data. The positions and N/C-terminal orientation of Thp2p and Mft1p are
based on this study, while the positions and orientation of Hpr1p, Tex1p, and Tho2p are mainly based on [13]. Hatched areas indicate predicted
flexible regions.

doi:10.1371/journal.pone.0103470.g005

Pena et al. also provided the first mechanistic insights into how
the THO complex may be recruited to co-transcriptionally active
genes by showing that the highly basic C-terminus of Tho2p
(residues 1279-1597) allows chromatin recruitment through
binding of nucleic acids [13]. However, since the association of
the THO complex with active chromatin was not completely
abolished upon C-terminal deletion of Tho2p, it was also proposed
that other components of the THO complex might contain
exposed, flexible regions that could be involved in DNA binding.
Recently, it has been shown that THO interacts directly with
RNA polymerase II through its poly-phosphorylated C-terminal
domain (CTD) [3]. To further assess whether disordered regions
might be present in the other components of THO, we performed
a search for disordered regions with the GlobPlot 2.3 prediction
server using the full-length sequences of Mft1p, Thp2p, Tex1p,
and Hpr1p [25]. In this search we identified the following
C-terminal regions as potentially disordered: Mft1p, residues
330-392 (63 residues, ~7 kDa, pI 4.01); Hpr1p, residues 703-752 (50
residues, ~5 kDa, pI 3.85); Thp2p, residues 246-261 (16 residues,
~2 kDa, pI 3.83); and Tex1p, residues 388-422 (35 residues, ~4
kDa, pI 12.02). The disordered region of Tex1p, which is highly
basic (pI 12.01), is adjacent to the likewise basic C-terminus of
Tho2p in the model. We therefore speculate that the C-termini of
Tex1p and Tho2p may work in conjunction to form a hybrid
domain structure of highly positively charged residues, possibly
important for chromatin and/or RNA binding. It is also possible
that this positive patch interacts directly with the RNA polymerase
II CTD through negatively charged, phosphorylated serine
residues [3]. The disordered regions of Mft1p, Hpr1p, and Thp2p
are, in contrast, acidic with pI values of 4.01, 3.85, and 3.83,
respectively. In fact, negatively charged, disordered regions occur
more frequently than positively charged ones in nuclear proteins,
and these regions have been proposed to take part in transient,
low-affinity protein-protein interactions [26-28]. The C-terminal
regions of Mft1p and Hpr1p may therefore constitute general
interaction platforms (or scaffolds) required for protein-protein
interactions during mRNP maturation. Consistent with this idea,
recruitment of the export factors Sub2p and Mex67p to the THO
complex during mRNP export is known to be facilitated through
interaction with the C-terminus of Hpr1p, where the
Hpr1p-Mex67p interaction is dependent on ubiquitination of
Hpr1p [29-31]. Also, deletion of the Hpr1p C-terminus results in
severe phenotypes including impaired mRNP formation and
genomic instability, together suggesting an important role for the
Hpr1p C-terminus in mRNA biogenesis [32]. In conclusion, we
have in this paper completed the architectural description of THO
by localising each protein in the complex. Our model reveals the
orientation and position of each subunit in the complex and
provides new clues to explain the mechanistic details of this
assembly. Further experimental evidence will now be needed to
understand the molecular mechanisms underlying recruitment of
the THO complex to active chromatin during early mRNA
biogenesis as well as to determine the detailed three-dimensional
structure of the complex.
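
The residue counts and approximate masses quoted above for the predicted disordered C-terminal regions can be cross-checked with simple arithmetic. The short Python sketch below does this under two stated assumptions: an average residue mass of about 110 Da (a rule of thumb, not a value from this study), and an Mft1p range entered as 330-392, which is the range consistent with the quoted 63 residues. The names `REGIONS`, `region_length` and `approx_mass_kda` are hypothetical, and the pI values are not recomputed because they depend on the actual sequences.

```python
# Rule-of-thumb average mass of an amino acid residue (assumption, in daltons).
AVERAGE_RESIDUE_MASS_DA = 110.0

# Predicted disordered regions as (first residue, last residue).
# The Mft1p range 330-392 is an assumption consistent with the quoted 63 residues.
REGIONS = {
    "Mft1p": (330, 392),
    "Hpr1p": (703, 752),
    "Thp2p": (246, 261),
    "Tex1p": (388, 422),
}


def region_length(first, last):
    """Number of residues in an inclusive residue range."""
    return last - first + 1


def approx_mass_kda(n_residues):
    """Rough mass estimate in kDa from the residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


for name, (first, last) in REGIONS.items():
    n = region_length(first, last)
    print(f"{name}: {n} residues, roughly {approx_mass_kda(n):.1f} kDa")
```

These rough estimates agree with the rounded masses given in the text to within about 1 kDa; exact values would require the sequences themselves.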

Supporting Information 

Figure S1 Overview of constructs, truncations and combinations
of the THO complex. A. Overview of Tho2p, Hpr1p, Mft1p, and
Thp2p truncation variants. The proteins were named according to
their full-length form (grey boxes). Blue boxes: stretch of residues
removed in the construct directly below. Purple boxes: Strep II
Tag. Red boxes: 6xHis-Tag. Expected molecular weights based on
sequence are indicated. Inset: Combination of two plasmids during
co-transformation yields a total of 32 combinations of the THO
complex. Proteins and genes were coloured as in A. B. 10 µg (1x)
or 20 µg (2x) of final eluates of THO 1-4 as analysed by
Coomassie-stained SDS-PAGE (lanes 1-8). Lane 9: Marker
proteins of indicated molecular weights. C. 20-100 µg of final
eluates of THO3 and THO5 analysed by Coomassie-stained SDS-
PAGE (lanes 2-7). Lane 1: Marker proteins of indicated molecular
weights. In both B and C, the expected protein molecular weight
(MW), the fusion tag (H: 6xHis-Tag, S: Strep II Tag) and whether
that protein is expressed (+/-) are indicated. Arrowheads: black fill,
Thp2p; white fill, Mft1p; gradient, Tho2p; asterisk, a protein
contaminant.
(PDF)

Figure S2 Analysis of complexes by static and dynamic light
scattering. A. Mft1pΔC232-392-Thp2p. Left, refractive index (red),
measuring protein concentration, right-angle light scattering
(RALS) signal (green), and hydrodynamic radius (black dots) as a
function of elution volume (mL); Right, absolute molecular weight
based on system calibration with bovine serum albumin (BSA), as
a function of elution volume. B. As in A, except data for
Mft1pΔC336-392-Thp2p. C. As in A, except data for
Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597.
(PDF)

Figure S3 Isolation of truncated subcomplexes of THO by
limited proteolysis. A. 20 µg of Mft1pΔC270-392-Thp2p (lanes 1-4)
or Mft1pΔC336-392-Thp2p (lanes 6-9) were incubated with
increasing concentrations of trypsin for 16 hours at 4°C, and the
fractions analysed by Coomassie-stained SDS-PAGE. Lanes 1 and
6: Untreated Mft1p-Thp2p complexes. Arrowheads specify the
identified stable protein fragments of Thp2p and Mft1p. M:
Marker proteins of indicated molecular weights. B.
Mft1pΔC336-392-Thp2p pre-treated with trypsin was analysed on a
Superdex 200 HR 10/300 gel-filtration column. The Mft1p-Thp2p
digest is shown with a blue line while the intact
Mft1pΔC336-392-Thp2p complex is shown with a brown line. The
red line is a control sample of bovine serum albumin with a known
MW (66 kDa). The y-axis represents mAU absorbance at 280 nm.
Note: Both the retention volume and peak area of the digested
sample are increased relative to the untreated sample, which
suggests formation of several Mft1p-Thp2p complexes of lower
overall molecular weight. Below the chromatogram is shown
SDS-PAGE analysis (Coomassie-stained) of fractions B6-B1.
Arrowheads are as in A and lane M shows marker proteins of
known molecular weights. The gel shows protease-resistant forms
of both proteins. C. Purified samples of binary
Mft1pΔC24!i-392-Thp2p (lanes 1-2) and Mft1pΔC232-392-Thp2p
(lanes 3-4) as identified by limited proteolysis. 10 µg (1x) or
20 µg (2x) was loaded in each lane. D. Mft1pΔC336-392 (lanes 1-2)
and Thp2p (lanes 3-4) purified in isolation and analysed by
Coomassie-stained SDS-PAGE. In both C and D the lane marked
M contains marker proteins of known molecular weights as
indicated and arrowheads specify the position of indicated
proteins. 10 µg (1x) or 20 µg (2x) was loaded in each lane. Protein
complexes were purified in four steps: (i) Ni-NTA
chromatography, (ii) TEV cleavage and Ni-NTA chromatography,
(iii) Q anion-exchange chromatography and (iv) gel-filtration
chromatography, while the isolated proteins (Mft1pΔC336-392 and
Thp2p, respectively) were purified in three steps: (i) Ni-NTA
chromatography, (ii) Q anion-exchange chromatography and (iii)
gel-filtration chromatography. Protein expression was performed
using the E. coli Rosetta (DE3) strain.
(PDF)

Figure S4 SAXS curves and Guinier plots. SAXS data obtained
for each sample as indicated. The Guinier plots (ln I(s) versus s²)
are inset to show linearity. The corresponding Rg values are shown
in Table S1.
(PDF)
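
For readers unfamiliar with Guinier analysis, the following minimal Python sketch shows how a radius of gyration can be estimated from the linear region of ln I(s) versus s², using the Guinier approximation ln I(s) = ln I(0) - (Rg² s²)/3. It is an illustrative sketch and not the procedure used to obtain the values in Table S1. The function name `guinier_fit`, the starting window of 10 points, and the fitting limit s·Rg < 1.3 (a common convention for compact particles) are assumptions, and the data points are assumed to be sorted by increasing s.

```python
import numpy as np


def guinier_fit(s, intensity, s_rg_limit=1.3, n_start=10, max_iter=20):
    """Estimate Rg and I(0) from the low-angle points satisfying s * Rg < s_rg_limit."""
    s = np.asarray(s, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    n = min(n_start, len(s))
    for _ in range(max_iter):
        # Straight-line fit of ln I(s) against s^2 over the first n points.
        slope, intercept = np.polyfit(s[:n] ** 2, np.log(intensity[:n]), 1)
        if slope >= 0:
            raise ValueError("non-negative slope: data are not in the Guinier regime")
        rg = np.sqrt(-3.0 * slope)
        # Re-select the fitting window from the current Rg estimate.
        n_new = max(3, int(np.searchsorted(s, s_rg_limit / rg)))
        if n_new == n:
            break
        n = n_new
    return float(rg), float(np.exp(intercept)), n


# Synthetic, noise-free example with a known Rg of 30 (in the inverse units of s).
s_demo = np.linspace(0.005, 0.2, 200)
i_demo = 100.0 * np.exp(-(30.0 * s_demo) ** 2 / 3.0)
rg_demo, i0_demo, n_used = guinier_fit(s_demo, i_demo)
print(rg_demo, i0_demo, n_used)
```

With real data, curvature of the plot at the lowest angles would point to aggregation or interparticle effects, which is why the linearity of the insets matters.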

Figure S5 Dimensionless Kratky plots. Kratky plots (s²·I(s)
versus s) are shown for each sample as indicated.
(PDF)
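
As a brief illustration of the Kratky representation, the Python sketch below rescales a scattering curve into the dimensionless form, (s·Rg)²·I(s)/I(0) versus s·Rg, which is one common way of making curves from particles of different size comparable. Whether the plots in Figure S5 were normalised in exactly this way is an assumption here. The function name `dimensionless_kratky` is hypothetical, and Rg and I(0) are assumed to come from a prior Guinier analysis.

```python
import numpy as np


def dimensionless_kratky(s, intensity, rg, i0):
    """Return (s * Rg, (s * Rg)^2 * I(s) / I(0)) for a scattering curve."""
    s = np.asarray(s, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    x = s * rg
    y = x**2 * intensity / i0
    return x, y


# Synthetic compact particle described by the Guinier approximation (Rg = 30).
s_curve = np.linspace(0.001, 0.2, 2000)
i_curve = 100.0 * np.exp(-(30.0 * s_curve) ** 2 / 3.0)
x_k, y_k = dimensionless_kratky(s_curve, i_curve, rg=30.0, i0=100.0)
peak_position = x_k[np.argmax(y_k)]
peak_height = y_k.max()
print(peak_position, peak_height)
```

For this idealised compact particle the curve is bell-shaped with a maximum near s·Rg = √3 and a height near 3/e; flexible or unfolded samples typically shift the maximum to higher s·Rg and fail to return to the baseline.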

Figure S6 Analysis of Mft1pΔC336-392 and Thp2p homodimers.
For each sample (Mft1pΔC336-392, green; Thp2p, orange/red), the
asymmetric (non-constrained) and P2-constrained SAXS envelopes
calculated using DAMMIF as well as GASBOR are shown
in three perpendicular directions, along with the half-volume
representing the monomeric protein (semi-transparent surface).
(PDF)

Figure S7 Multiphase-modeling in MONSA. A. Overlay of
Mft1pΔC336-392-Thp2p heterodimer representative models from
DAMMIF (blue) and MONSA (grey) in three perpendicular views.
B. Overlay of Mft1pΔC336-392-Thp2p-Tho2pΔC1274-1597
heterotrimer representative models from DAMMIF (magenta) and
MONSA (grey) in three perpendicular views.
(PDF)

Figure S8 Dummy-atom models based on SAXS data from the
trimeric THO complex. A. Filtered average model based on 11
dummy-atom models fitted to the data with a mean Normalised
Spatial Discrepancy (NSD) value of 1.108 and variation of 0.056.
B. Reference dummy-atom model showing the highest
resemblance to the remaining models with an average NSD of
1.028. C. Dummy-atom model showing the second lowest NSD
(1.042). D. Dummy-atom model showing the third lowest NSD
(1.056). E. Overlay of the models in A-D. Some variation is
observed in the part of the trimer corresponding to Tho2p (circled).
(PDF)

Table S1 Molecular parameters for each sample analysed
by SAXS. The table shows the SAXS-derived parameters,
Rg, Dmax, and MW. The theoretical MW is the molecular weight
calculated based on amino acid sequence and the numbers in
parentheses are the corresponding homodimer masses.
Experimentally determined MWs were based on comparisons with a
BSA standard sample (based on Porod volume).
(PDF)